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Abstract We address finding the semi-global solutions to optimal feedback 
control and the Hamilton-Jacobi-Bellman (HJB) equation. Using the solution 
of an HJB equation, a feedback optimal control law can be implemented in real¬ 
time with minimum computational load. However, except for systems with two 
or three state variables, using traditional techniques for numerically finding a 
semi-global solution to an HJB equation for general nonlinear systems is infea¬ 
sible due to the curse of dimensionality. Here we present a new computational 
method for finding feedback optimal control and solving HJB equations which 
is able to mitigate the curse of dimensionality. We do not discretize the HJB 
equation directly, instead we introduce a sparse grid in the state space and use 
the Pontryagin’s maximum principle to derive a set of necessary conditions in 
the form of a boundary value problem, also known as the characteristic equa¬ 
tions, for each grid point. Using this approach, the method is spatially causality 
free, which enjoys the advantage of perfect parallelism on a sparse grid. Com¬ 
pared with dense grids, a sparse grid has a significantly reduced size which is 
feasible for systems with relatively high dimensions, such as the 6-D system 
shown in the examples. Once the solution obtained at each grid point, high- 
order accurate polynomial interpolation is used to approximate the feedback 
control at arbitrary points. We prove an upper bound for the approximation 
error and approximate it numerically. This sparse grid characteristics method 
is demonstrated with two examples of rigid body attitude control using mo¬ 
mentum wheels, keywordsoptimal feedback control Hamilton-Jacobi-Bellman 
equation sparse grid method of characteristics rigid body attitude control 
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1 Introduction 


The optimal feedback control of nonlinear systems is a challenging problem. 
Using dynamic programming, the feedback optimal control law is constructed 
based on the solution of a partial differential equation (PDE) that is called 
the Hamilton-Jacobi-Bellman (HJB) equation. This theoretically elegant ap¬ 
proach suffers some difficulties in computation due to the curse of dimensional¬ 
ity, a term that was coined by Richard E. Bellman when considering problems 
in dynamic optimization, which relates to the fact that the size of the dis¬ 
cretized problem in solving HJB equations increases exponentially with the 
dimension. Finding an approximate solution to HJB-type of equations in a 
local neighborhood of a trajectory has been extensively studied, see for exam¬ 
ple Al’brekht [I|, Cacace et al. H, Kang et al. [1^, Lukes [2^, Navasca and 
Krener , Falcone and Ferretti[lll| , and the references therein. Some of the 
previously proposed methods can be applied to systems with high dimensions. 
However, finding semi-giobal solutions to HJB equations, i.e., solutions satis¬ 
fying a required accuracy in a given domain, faces the curse of dimensionality. 

We present a new computational approach to finding semi-global solutions 
of optimal feedback control and HJB equations for nonlinear systems. The 
method is designed to mitigate the curse of dimensionality. Our approach does 
not discretize the HJB equation directly, instead we introduce a sparse grid in 
the state space and use Pontryagin’s maximum principle (PMP) to derive a set 
of necessary conditions in the form of a boundary value problem (BVP), also 
known as the characteristic equations, for each grid point independently. For 
details on PMP and its relationship with HJB equations, the reader is referred 
to Pontryagin [2^, Barron and Jensen Q, and Bardi and Capuzzo-Dolcetta 
[2|. The curse of dimensionality is mitigated by using sparse grids. The idea 
of sparse grids is based on Smolyak’s work The reader is also referred to 
Bungartz and Griebel Garcke [l3|, and Zenger for more details and 
different perspectives. Relative to a dense grid, the size of the sparse grid is 
significantly reduced. In the propose computational method, the solution at 
each grid point is found using a Lobatto Hla method that solves a two-point 
BVP. Different from many existing algorithms of solving PDFs, this approach 
is not based on spatial causality. A significant advantage of this causality 
free method lies in its perfect parallelism, a desirable property for modern 
computational equipment with many-core clusters. Some results on the rate 
of convergence and the upper bound of approximation error are proved. 

The method is exemplified by two numerical examples of rigid body atti¬ 
tude control using momentum wheels. In this case, the HJB equation is six 
dimensional. The second example in itself is interesting, in which the system 
has two pairs of momentum wheels. The system is uncontrollable, which makes 
optimal feedback control difficult. To the best of our knowledge, no solutions 
have been found for the HJB equation of this problem. 
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2 Problem Formulation 

An optimal feedback control law is a function 

u*{t, x) 

that minimizes the cost functional 

Lit, x{s),u{s)) ds + hix{T)) 

subjecting to the control system 

x{s) = /(s, a;(s), u{s)) (t <s <T), (la) 

x{t) = X. (lb) 

In this formulation, x : {t, T) R" is the state, u : (t, T) —>■ K™ is the control, 
L : (t, T) X K” X A —>■ R is running cost, h : (t, T) x R" R is the final cost, 
/ : (t, T) X R" X A ^ R" is a bounded, Lipschitz continuous function, and 
A C R"* is a compact set. Here t G R is the initial time and x G R" is the 
initial condition. For the simplicity of discussion, we assume that the final 
time, T G R, is fixed. Following the standard approach in optimal control, we 
define the Hamiltonian 

H (t, X, A, u) = Lit, X, u) + A^/(t, X, u) 

where A G R" is the costate and u G A. The function 

u*it,x,X) = arg min Hit,x,\,u) 

U 

minimizes the Hamiltonian. The value function is defined as 

Vit,x) = \ni [ L(s, a;(s), m(s)) ds + h(a;(r)) (0 < t < T), 

“ Jt 

where a;(s) satisfies the identity (HD. Further V satisfies the HJB equation, a 
PDF, with a final time condition given as 

Vtit,x) + H* it,x,Vjit,x)) =0 (x G R",0 < t < T), (2a) 

ViT,x)=hix) (xGR"), (2b) 

where H*it,x,X) = Hit,x, X,u*), Vt = and 14 = fj- If equation (HJ can 
be solved, the feedback control law is a function defined as 

u*it,x) = u* it,x,vjit,x)) . (3) 

The design and control of engineering systems involve both on-line and off¬ 
line computations. In the method proposed here, the HJB equation is solved 
in the off-line design phase. Once the solution of the HJB equation is found on 
a sparse grid, the on-line computation for real-time feedback control is carried 
out using interpolation, a numerical process that is simple and reliable. This 
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approach offsets the main computational load from on-line computation to an 
off-line design phase. In addition, off-line computations allow one to use more 
powerful computers than those onboard systems such as satellites or unmanned 
vehicles. Because of the causality-free property, which is explained later, using 
parallel computers significantly reduces the required computational time when 
solving a HJB equation. In this paper, we make the following assumption. 

Assumption 1 The optimal control is uniquely determined by the PMP at 
each point in the state space. 

If H{t,x, X,u) is a strictly convex function of {x,u) over an open convex set 
containing all the admissible values of (x, u) at fixed t and A, Assumption [T] 
holds true. In general. Assumption [T] can be guaranteed based on necessary 
and sufficient conditions of optimal control, which has a vast literature of pub¬ 
lications, including both classic and viscosity solutions @,111. For problems 
with a proved existence and uniqueness of solutions, finding the optimal feed¬ 
back control for practical real-time applications is still a difficult problem. If 
a system has four or more state variables, finding a desecrate solution and 
implementing real-time interpolation for feedback control suffer the curse of 
dimensionality. The main contribution of this paper is that, for a well-defined 
problem of optimal control with a moderate dimension, it is possible to achieve 
optimal feedback control using interpolation on a discrete approximate solu¬ 
tion. 


3 The Sparse Grid Characteristics Method 

In many numerical methods for HJB equations, which are typically solved 
backward in time, the discretization is based on spatial causality and the 
computation is explicit in time. The value of the solution function V(t, x) at a 
grid point is computed at an earlier time using the known value of the function 
at neighboring grid points at a later time. This coupling usually comes from the 
discretization of the spatial derivatives. For HJB equations of high dimensions, 
in our examples the dimension d = 6, solving the equation using traditional 
algorithms based on dense grids is computationally challenging. For instance, 
in a six dimensional space, if 2® = 32 grid points are used to approximate a 
single variable, which is quite small, the total number of grid points for a 6-D 
problem is over 10®. If 100 points are used for a single variable, then the size 
of the dense grid is 10^®. The curse of dimensionality is a bottleneck problem 
in solving HJB equations for practical applications. 

To mitigate the curse of dimensionality, we introduce a causality free com¬ 
putational method. It consists of two components: (1) A solver that can find 
the optimal control and the value of V{t, x) at any grid point; the computation 
is independent of the value of V{t,x) at other points in the state space, i.e., 
causality free. (2) A set of grid points, such as a sparse grid, with a reduced 
size to make the problem tractable. The causality free method introduced in 
this section is based on BVP solvers and sparse grids. The goal is to solve the 
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problem of optimal control and its associated HJB equation with a moderate 
dimension. In this paper, we use a 6-D example to illustrate the algorithm. 

Why causality free ? From a conventional viewpoint of computation, solving 
a two-point BVP is not an efficient approach for PDFs. However, the causal¬ 
ity free method is perfectly parallel. In fact, each grid point can be assigned 
a CPU core. The computation at a grid point requires no communication 
with the computation process at other grid points. Although not preferred in 
a serial computational environment, causality free algorithms can easily be 
implemented in massively parallel computers. In addition, causality free al¬ 
gorithms are ideal for sparse grids in which the space between adjacent grid 
points varies significantly. The combination of sparse grids, BVP solvers, and 
parallel computation makes it possible to mitigate the curse of dimensionality 
effectively for problems in which d is not too large. 


3.1 A causality-free method using the necessary condition of optimal control 

In contrast to traditional PDE discretizations, our proposed technique does 
not discretize the HJB equations directly but instead uses the PMP to derive 
a set of necessary conditions in the form of a BVP for each grid point, also 
known as the method of characteristics. As a result, the computation of the 
solution at an initial point in space is independent of other points. 

The optimal trajectory for the optimal feedback control law described in 


Section [5] is a solution of the two-point BVP 

x{s)= (^^{s,x{s),\{s),u*{s,x{s),X{s)))j , (4a) 

Ms) =-(^^{s,x{s),\{s),u*{s,x{s),X{s)))j , (4b) 

i(s) = L(s, x{s), u*{s, x(s), A(s))), (4c) 

where t < s <T with the boundary conditions 

x(t) = X, (4d) 

zit) = 0. (4f) 

The optimal control and the minimum costs are 

u*{t,x) = u*{t,x,X{t)), V{t,x) = z{T) + h{x{T)). (5) 


Given any grid point, x, we can solve the BVP (jH) and use the identities ® 
to find the optimal control and the corresponding minimum cost without us¬ 
ing the value of V (t, x) in any nearby points, i.e., the computation is causality 
free. Numerical algorithms for solving BVPs similar to ([1]) have been studied 
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by many authors. In the examples, we adopt an algorithm based on a four- 
point Lobatto Ilia formula for our BVP solver. This is a collocation formula 
and the collocation polynomial provides a solution that is fifth-order accurate 
(see Kierzeiika and Shampine [20| for more detail). We would like to point 
out that the computation is not limited to the Lobatto Ilia BVP solver. The 
problem of optimal control at each grid point can be solved using any compu¬ 
tational method. 


3.2 Sparse grids 

In the approximation of multivariable functions, sparse grid interpolation is 
an approach in which the approximation is build on a subset of a dense grid 
with a significantly reduced size. Sparse grids are derived from the Smolyak’s 
construction . Although the original idea was invented more than fifty years 
ago, some recent work reveals potentials of its applications liflMllSlsii. It 
is a known fact that the size of sparse grids increases with the dimension, d, 
on the order of 

0(V(log7V)"-'), 

which is in sharp contrast to the size of the corresponding dense grid 

Here N = 2'^“^ where g is a measurement of the level of refinement of the 
sparse grid. Obviously, the significantly reduced number of grid points has its 
impact on the accuracy. For example, an upper bound of interpolation error 
using a classic sparse grid satisfies 

||e||^. =0(iV-^(loglV)^-i) 


for all functions with bounded mixed derivatives up to the second order. Com¬ 
pared with the error bound using a dense grid, which is 0(V“^), we pay a 
small price in terms of accuracy for problems with a moderate dimension and 
we achieve a significantly reduced size of the grid. 

Sparse grids have a hierarchical structure. For each variable, the set of grid 
points contains several layers of subsets, denoted by Ah Let W be the number 
of points in A*. These subsets are assumed to have a telescope structure, 
A*“^ C A* for i > 1. For illustration purposes, we exemplify the definition of 
the classic sparse grid using equally spaced nodes in [0, as 


w = + 1, 


A* = 


k-1 

Oi — 1 



( 6 ) 


for i > 1. 
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The set of points in X* but not in X* ^ is denoted by Z\X®; and the number 
of points in the set is i.e., 

AX^ = X\ 

AX‘ = X* - (i > 2), 

^N,= \AX^\ 

In this paper, the points in X® are represented by 1 < j < Ni, and the 
points in AX'‘ are represented by /^x'^, 1 < j < i.e., 

AX'^ = ..., 

for i > 1. In the sequel, we adopt the multi-index notations 

i = [*1 *2 • • • id\ , 

|i| = *1 + *2 + • • • + *d, 

A — (Ai A-i Ad\ 

Xj 

AX'^ = X X ... X AX^-^. 


The dense grid build on X'^ for an integer q > 0, denoted as is 


GLnse=X‘^X---xX^= U 


Following Smolyak’s approximation algorithm [3, 271, the sparse grid, denoted 
by Gfparsei IS defined as 

= U 

|i|<9 


sparse 


Two plots of 2-D sparse grids are shown in Figure [T] for q = 6 and 8. If 
<7 = 8, Glpai-se has a total of 385 grid points whereas the corresponding dense 
grid, Gjgjjgg, has (2® + 1)^ = 4225 points. The difference becomes increasingly 
significant for higher dimensions. 

Sparse grids can be build using other nested sequences of grids X®, i > 1. 
For instance, a modified sparse grid is defined using 


f iVi = 1, = 

I Ni = 2®-i -f 1, X® = 



k = 


1 , 2 , 


.N,, 


(7) 


for i > 1. Except for i = 1, it is identical to the classic sparse grid ([5]) . The size 
of Gsparse Is further reduced. The modified sparse grids for q = 6 and q = 8 
are shown in Figured] When q = 8, the total number of points is 321. 
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The Chebyshev Gauss-Lobatto (CGL) sparse grid is defined in a similar 
way. The the grid points are defined as 


1—1 

II 

II 

N, = + 1, 

II 



{k — l)7r\ 

2*-i J 



( 8 ) 


for i > 1. Two examples of this grid are shown in Figure [31 


3.3 Interpolation on sparse grids 

Given a problem of optimal control, its solution on a sparse grid can be com¬ 
puted off-line. Then in real-time feedback control, the value of minimum cost, 
V(t,x), and the costate, A(t), can be computed using interpolation on the 
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Fig. 4 Basis functions for the classic, modified, and CGL sparse grids 


sparse grid. For simplicity, this section will focus on interpolating V (t, x). Con- 
sider X* C [0,1], t > 1. A basis function, a%{x), for a point i G A* is defined 
on [0,1] satisfying 

{ 1, X = x\ 

. 

0, x€X\x^x. 

For ^x'j G Z\A® C , we define a simplified notation 

a]{x) = 


Figure 0] shows a few basis functions for the various sparse grids. 
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The classic sparse grid uses piecewise linear basis functions 

a];(a;) = x, 

= l-x, 

l-\2^-^x-{2j-1)\, xG 
0, otherwise; 


2(j - 1) 2j 
22—1 ’ ’’ 



(9a) 

(9b) 

(9c) 


for i > 2 and 1 < j < 2® For the modified sparse grid, a® (x) is the same as 
in identity (15^ if i > 3 and otherwise 


= 1 , 


alix) 

ajix) = 

alix) = 



(10a) 

(10b) 


(10c) 


For the CGL sparse grid, the Lagrange polynomials form the basis functions 


ajix) 


n 

x'^X' 

x'^x] 


X — X* 

~i 7 ’ 

Xj — 


for i > 1 and 1 < j < 2* 

In the following, the inequality j < implies 


jl < Zi , J2 < : • ■ • ) jd< ■ 


The inequality k < A^i is similarly defined. The interpolation on a sparse grid 
does not need every basis function. In fact, for each i > 1, an interpolation 
function uses only those a\. for which x S AX'^. Let be the interpolation 

of / at grid points of Glp^rse- It is defined recursively on [0,1]'^ as 

l‘^-\f)=0, (11a) 

+AI^f), q>d, (11b) 

^2:«(/) = ^ ^ w]a]\ (8> • • • (8> , q>d, (11c) 

|i|=9l<j<Zi^‘ 

u;]=/(x])-I'^-i(/)(x]), (lid) 


(g) • • • (g) (xi,..., Xd) = (xi) ■ • • (x^). 


where 
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The weights, tp!, are scalars called hierarchical surpluses. 

An alternative formulation uses basis functions a%{x) for i € Ah The 
corresponding basis functions are denoted by 

ulix) = 

for G Ah Note that the subindex in this notation, k, represents the index in 
A®, not in Z\A®. Again, we use piecewise linear basis functions for the classic 
and modified sparse grids as in the definitions ([9]) and (nni). Similarly, we use 
Lagrange polynomials as the basis functions for the CGL sparse grid. For the 
A® defined in the equations (H])-®, we have the relation 

a®(x) = ulj_-^{x), 

for 1 < j < - interpolations in [0,1] and [0,1]^ are 

Ni 

ll\f)=J2fixl)ul, 

k^l 

l<k<7Vi 


Let X‘i{f) be the interpolation on sparse grids. Following the formulation in 
Barthelmann et al. Q, Delvos and Wasilkowski and Wozniakowski [^ . 
the interpolation on the sparse grid, in terms of the new basis is 


q-d+l<\i\<q 


“EE (-1)’^"' 

q-d+l<\i\<q l<k<lVi 


d-1 

9-|i| 


f{xl)ul\ <s> ull (g) 




( 12 ) 


Before the next section on error analysis, the algorithm is summarized as 
follows. The off-line computation has three steps. 

Step I Generating a sparse grid, and its basis functions 

J j _ 

Step II Solve the two-point BVP defined in equation (|3]) at each 
grid point. 

Step III Generating the hierarchical surpluses, {w®}, using (fTTIl or 
other equivalent formulae. 

The output of the process consists of the values of \{t) and V{t, x) at grid 
points in G|pjjj.gg. For on-line optimal feedback control, the value of these func¬ 
tions at an arbitrary x is approximated using interpolation on the sparse grid. 
Then the control input © is computed for real-time control and operation. 
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For a feedback using model predictive control, like the examples in Section [SJ 
the feedback requires the value of \{t) at t = 0 in each time interval. It is not 
necessary to solve the problem for t 7 ^ 0. In the case of interpolation at t > 0, 
which is not addressed in this paper, the time variable can be included in the 
sparse grid as another dimension. 


4 Error Analysis 

Let represent the value of V{t,x) evaluated at x = Xy i.e., 

V^ = V{t,xl). 

A causality free algorithm, such as the numerical solution of the characteristic 
equations (H))-®, approximates with an error 

\f = \f + 4. (13) 

At an arbitrary point x, the approximation based on sparse grid interpolation 
is 

X?(y)(x) = J9(y)(x) + eBVP, 

where eevp is the error due to ej, the numerical error of the solution of the 
BVP ®-®. More specifically, 

CBVP = H H |j|^ elul\ ul^^. (14) 

9-d+l<|i|<g l<k<Ari ^ ' 

Relative to the true value, the interpolation process has an error 


Therefore, 


einterp = X'?(R)(x) -V{t,x). 


X^{V){x) = V{t,x) + einterp + BbYP- 


(15) 


4.1 Error upper bounds 

For a survey of sparse grids and error estimation, the reader is referred to 
Bungartz and Griebel [g and Garcke [l^. In the case of a classic sparse grid, 
applying a piecewise linear interpolation to functions with continuous second 
order partial derivatives, einterp is at the order of 

||ei„terp||=o(^^i^^^^^ (16) 

which holds both for the L^- and L°°-norm. 

The estimate in identity (111)11 holds for both || einterp 1 1^,2 and 11 einterp || 2 ,cxi ■ In 
the case of a GGL sparse grid, an error upper bound of einterp is proved by 
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Barthelmann et al. Q . Suppose a function / has fcth order continuous partial 
derivatives. Define the norm 




= max< 


alii 


dx^ • ■ • 


f 




then the interpolation of / on a CGL sparse grid satisfies 


||einterp|li4/fc,cxj — O 






where M is the number of sparse grid points. 

In this section, we prove an upper bound for eevp- Its value is small if the 
dimension, d, is one or two. However, the error becomes larger when solving 
PDEs with a higher dimension. In our examples of d = 6 and cbvp is not 
negligible. Let’s define 

Ni 

Ai = max y Wllx) 

/c=l 

for i > 1. For polynomial interpolations, this number is the Lebesgue constant. 
For any integer I > d, we define 

^ ^ii ^42 ■ ' ■ -^id ■ 

\i\=l 


Theorem 1 (i) Suppose e > Q is an upper bound of the numerical error at 


each grid point, i.e., 


in equation (US are smaller than e. Then 


|eByp|lLoo<e ^ IIS'; 




q-l 


(ii) Suppose Aq > 0 is a constant such that 


Then 


A,<Aq {l<i<q-d+l). 

W^^bvpWl^ 


0({\0gNf-^Afj 


(17) 


(18) 


where N -\-\ = 2^ '^ + l*s the number of grid points in each dimension. 
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Proof (i) It is easy to check that 

Ni Na 

|i|^/ l<k<A/'i |i|—^ ki — 1 kd — 1 

|i|=i \fci = l / \fcd = l / 

— ^ ^ ^11^12 ■ ■ ■ -^id 
\i\=l 

for all X G [0, l]"^. Therefore, given any integer I > d we have 

E E l^fci ® ® ® < Si- (19) 

|i|=i l<k<Ari 

Then the upper bound (fTTll is a corollary of equations (HD), (HU)) and (fT^ . 
(ii) To find the order of ||eBVp||/e, we first note 

Si = 'y ) ■ ■ ■ Aij^ 

\i\=l 


|i|=Z 



{i-dy. 

<{i- 

From identity (fT7)) . 



= 2‘‘-\q-lf-^A'^g. 


Then the relation (1181) follows from the fact 

q = d + log 2 N = 0(log N). 


□ 
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Corollary 1 For the classic or the modified sparse grid and piecewise linear 
interpolation, 

=o((logiV)-^-i). ( 20 ) 

For the CGL sparse grid and polynomial interpolation, 

^ Q . ( 21 ) 

Proof The basis functions of the piecewise linear interpolation have the follow¬ 
ing property 

Ag = 1. 

The Lebesgue constant for the CGL grid points is bounded by (see for example 
Hesthaven et al. 0 ) 

An < - log iV -h — ( 7 -I- log - ) -I- — log 2 = O (log N) (22) 

TT TT \ TT y TT 

where 7 is Euler’s constant. Substitute Aq into equation (fl^ to yield equa¬ 
tions (EHl) and (EH). □ 


4.2 Numerically estimate eevp 


Some numerical examples show that the upper bound in inequality ()17l) tends 
to be conservative. Using relations EZD and (EH, we can Hnd an error upper 
bound for the CGL sparse grid Glp^j-se- For example, if 5 = 13 we have 


||eBVp|lL°° 

€ 



Si < 3.66 X 10 ^ 


(23) 


In this estimation, Cbvp = ej each grid point is assumed to be the maximum 
value, e. It is a conservative assumption. As an alternative, we can assume ej 
is a random variable. (This approach is commonly used in uncertainty quan- 
tihcation, rightly or wrongly, to get a handle on the model error.) We will do 
this here to get another estimate of bbyp- 

From identity d, bbvp has two special properties. Firstly, the value of 
Cbvp is not directly dependent on f{x). The error is solely based on the type 
of grid, the interpolation basis functions, and the distribution of ej. Therefore, 
the estimation of Cbvp he done off-line for a given grid. The result is then 
applicable to a family of problems. Secondly, Cbvp is a linear function of el. 
The estimate of cbvp for a given distribution of ej is applicable to a different 
distribution of the same type through a simple rescaling. 

Given a sample set of ej, bbvp can be computed using equation (|14|) at any 
X in [0,1]"^. Finding the probability distribution of ej is not a problem to be 
addressed in this paper. As an example, we assume that ej are independent 
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random variables with uniform distribution in [—e, e]. Let |ej| be a sample 
data set with a uniform distribution in [0,1]. After rescaling, equation (THl) 
implies 


eBVP 

e 


E E (-ir'" 

ij-d+l<|i|<i3 l<k<Ari 





u 

ki 


(8> u 


*2 

fc2 




In an numerical example, a set of |Gfparse| = 44,689 random numbers are 
generated as the sample value, ej, in [0,1]. At iV = 2,000 random points in 
[0, l]'^, we found 


eBVP 


< 66.60. 


(24) 


The bound in (I24|) is much smaller than the upper bound in the inequality (1231) 
derived from Theorem [TJ We would like to emphasize that this practical way 
of estimating bbvp is independent of the function to be approximated. The 
overall error Cjnterp + gbvp can be approximated in a similar way based on 
the assumption that e, the error upper bound of BVP solver, is known. This 
point is discussed in the next section and exemplified in Example I. A thorough 
numerical analysis of errors is outside the scope of this paper. By no means can 
the examples in this paper lead to a general conclusion about the error upper 
bound. However, the examples in this paper present a practical approach of 
using Monte Carlo simulations to analyze bbyp and Cinterp + cbvp- 


4.3 Numerically estimate Cinterp + cbvp 

What we ultimately care about is the total error in an approximate of V {t, x), 
which is Cinterp + cbvp in equation (1151) . For causality free algorithms, such 
as solving the BVP (|1])-(1S]), numerical errors do not propagate in space, i.e., 
the value of V {t, x) can be computed independently from the approximation 
error at other points. This special property of causality free algorithms has an 
important implication. A BVP solver with accurate error control can be used 
to approximate the error of a numerical solution of the PDE. There is a sizable 
literature of numerical BVP algorithms. Some approaches are able to control 
the error within a given tolerance. For the examples in this paper, we use a 
four-point Lobatto Ilia formula which can be implemented with a controlled 
true error [^. Given a numerical solution on G|pj,jgg, 

V/ ~ V{t,x]). 

The approximate of V(t, x) at any point x is obtained by interpolation 

Vit,x)^I^V). 

Meanwhile, a BVP solver with error control can be applied to solve the equa¬ 
tions 0-® at the same point. Suppose the solution is V{t,x). The error 
tolerance is set so that its true error is much smaller than Cinterp + sbyp- To 
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find the distribution of einterp + sevPi & set of sample points is generated in 
the state space, organized or random. The numerical error at these points can 
be approximated using 


Icinterp + eavpl = \l‘^{V){x) -V{t,x) 


T^{V){x)-V{t,x) 


Although the computation of V(t, x) can be slow, the entire process is perfectly 
parallel because of its nature of being causality free. In the examples below, 
128 CPU cores are used to approximate the error at 1280 random points. The 
error tolerance of V{t,x) is set at 10“^ or 10“®. The numerical results show 
that this tolerance is smaller than [cinterp + eevpl by at least three orders of 
magnitude. 


5 Examples 

In this section, two examples of optimal attitude control are presented. The 
system model represents a rigid body controlled by momentum wheels. The 
first example is a system with three pairs of controllable momentum wheels. 
The second example is an uncontrollable system with two pairs of momentum 
wheels. The uncontrollable case in itself is interesting. The numerical result is 
fundamentally different from existing controllers in the literature. 

Consider a rigid body system. Let { 61 , 62 , 63 } be an inertial frame of or¬ 
thonormal vectors and let { 6 },e 2 ,e 3 } be a body-fixed frame, or body frame. 
In this pap er, the attitude of a satellite is represented by Euler angles (see 
Diebel |l0| for a good introduction to representing attitude) 

V = \(j) 9 

in which 0 , and tp are the angles of rotation around e}, 63 , and 63 , respec¬ 
tively, in the order of (3,2,1). The angular velocity is a vector in the body 
frame, 

U! = [wi a;2 a;3]. 

The control system using momentum wheels is defined by a set of differential 
equations Q 


V = E{v)oj, (25a) 

Ju = S{uj)Riv)H + Bu, (25b) 

where i? G is a constant matrix in which m is the number of control 

variables, u G R"* is the control torque, J G is a combination of in¬ 

ertia matrices of the rigid body without wheels and the momentum wheels, 
iL G is the total and constant angular momentum of the system, and 
, S{u}), R{v) G R^ are matrices. Details can be found in Kang and Wilcox 
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5.1 Example I 


The system has three pairs of control momentum wheels, m = 3. In the 
model ipSl) . the following parameter values are used 


ri ^ 

7 2010 


'2 0 o' 


T 

— 1 — 

1,5 ^ 10 

, J = 

0 3 0 

H = 

1 

— — 1 

LlO 15 J 


0 0 4 


1 


The optimal control is 


argmin / L{v,uj,u) ds + Wi\\v{T)\\^ + W^\\ui{T)\\^ 
U Jt 


where 


= 1 , 


r/ ^ II l|2 I ^2|| ||2 . W^3|| ||2 

L{v,uj,u) =—\\v\\ +—||w|| +—||w||, 

11/2 = 1 , W3 = 1I2, Wi = l, W5 = l, t = 0, 


T = 20. 


Since L is a convex function it can be proved that a unique solution exists 
in a neighborhood of the target state. The solution V{t,v,uj) is computed at 
t = 0 for initial states u(0) and w(0) in two domains, Di and D 2 , of different 
size, 


Di — |u,a; G 
D2 = |u,a; G 


and 

6 6 
-^ < and 


TT TT 1 

- < a;i,a;2,W3 < 

7 T TT 1 

- < a;i,a;2,a;3 < 


The computation is based on the CGL sparse grid with q = 13. The number 
of grid points for each dimension is 2'^“® + 1 = 129. The total number of grid 
points in the 6-D domain is 


I '“^sparse 


44,689, 


which is small in comparison with the size of a dense grid, 


IG^nsel = 129® > 4.6 X 10^^ 


In the computation, the two-pont BVP (|1]) is solved at each grid point in 
G^parse using a four-stage Lobatto Ilia method The error tolerance is 
10“^^. The computation is carried out in Hamming, a parallel computer of 
Naval Postgraduate School. Although as many as |G|parse| can be used, we 
limit the computation to 512 CPU cores. To check the accuracy of the over¬ 
all solution, the upper bound of Cinterp + eevp is numerically approximated 
using the method in Section 14.31 More specifically, 1280 points are randomly 
generated in Di and 02- At each sample point, the value of V (0, v, ui) is com¬ 
puted using interpolation. The true value at the same point is approximated by 
V{0,v,uj), which is computed by applying the BVP solver to the equation (U) 
with an error tolerance 10“® in Di and 10“^ in 02- This tolerance is much 











Sparse Grid Characteristics Method for Optimal Control and HJB Equations 


19 


Domain 

1 

I^Lnsel 

\r^ 1 

1 '-^sparse | 

MAE 

Relative MAE 

Di 

13 

> IOI2 

44,698 

4.9 x 10-’^ 

4.0 X 10-’’ 

D 2 

13 

> IOI2 

44, 698 

3.6 x 10-3 

7.3 x 10-4 


Table 1 Summary of results for Example I 



time 



time 


Fig. 5 Trajectories (Solid (p, o^i, ui; dashed r I 9, u) 2 -, U 2 ', dotted r \ ip, CJ3, U3) 


smaller than Cinterp + sbyp so that the error of the BVP solver can be ignored. 
The difference, V (0, v, w) —y(0, v, w), is an approximate of einterp+CBVP at the 
sample point. In Di, the mean absolute error (MAE) is 4.9 x 10“^. The MAE 
for the relative error is 4.0 x 10“^. In the larger domain D 2 , the MAE equals 
3.6 X 10“^ and the relative error is 7.3 x 10“"^. The results are summarized 
in Tabled] Figure [Sj shows a typical trajectory in which (v,uj) converges to 
zero, i.e., the rigid body is stabilized. The accuracy of the solution, especially 
in £> 2 , is not as high as PDE solvers in some 2-D or 3-D cases. This is not 
surprising because we sacrifice the accuracy in exchange for a sparse grid that 
is tractable in 6-D. For a fixed dimension d, the accuracy of the solution at 
an arbitrary point depends on q and the size of the domain. For a fixed q, 
the accuracy is increase when the size of the domain is decreased. Shown in 
Tabled! decreasing the linear dimension of D 2 by 50% increases the accuracy 
by three orders of magnitude. 
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5.2 Example II 


In the following, we consider a rigid body controlled by two pairs of momentum 
wheels. Although satellite systems are quipped with at least three pairs of 
control momentum wheels, malfunction may occur in some wheels. It is proved 
by Crouch Q that this system is uncontrollable. How to stabilize the satellite 
around a desired attitude is a challenging problem. Related work can be found 
in Gui et al. 0 , Horri and Hodgart flfl. Kim and Kim [U, Krishnan et al. 
[^ . Terui et al. [2^, Tsiotras and Liio|2^ and references therein. Different 
from exiting results, we do no assume zero angular momentum. In addition, 
the controller based on the solution of the associated HJB equation is smooth. 
The optimal control is able to smoothly stabilize the rigid body at an attitude 
that is closest to the desired orientation. The result in this paper is different 
from those in Kang and Wilcox 0 where the stabilization does not guarantee 
an optimal attitude. In the present paper, we optimize a cost function that 
automatically stabilizes the system at an optimal attitude. In addition, the 
result is integrated with a model predictive control (MFC) to achieve feedback 
stabilization in the presence of noise. 

In this section, the following values are assigned to the parameters 


B = 

1 

0 

J = 

'2 0 o' 

0 3 0 

, H = 

'12' 

12 


[r2 oJ 


0 0 4 


6 


The optimal control is 


arg min 

U 


L{v, uj, u) ds 


where 


T f \ ^1 II / n||2 I ^^2 II ||2 W3 2 

L{V,UJ,U) = —\\V -Ve{v,Uj)\\ +—||W|| +—||U||, 

Wi = 1, W 2 = 2, W 3 = 0.5, to = 0, T = 30. 


(26) 


The function Ve(v,uj} represents the optimal attitude reachable from (v,oj). It 
is a known fact that this system is uncontrollable. The desired attitude, in 
our example v = 0, may not be reachable. In the case of H = 0, nonsmooth 
controllers can be derived to stabilize the system 0 Hi, [m, [M 
the case of id 7 ^ 0 , a manifold of reachable states (v, uj) satisfies [8 


28, 


. In 


{Juj — R{v)H) = constant, 
for some C G such that B = 0. 


The attitude Ve{v, uj) is a target attitude in this reachable manifold. A satellite 
system may have to meet multiple requirements of orientation, such as pointing 
sensors to the desired direction and at the same time keeping its solar panel 
facing the Sun. But the desired attitude, for instance u = 0, may not lie on 
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time 



Fig. 6 Trajectories (Solidui; dashed r \ 0, 0 J 2 -, U 2 ] dotted r I ijj, lo^) 


the manifold of reachable states. We define the following optimal Ve{v,uj) as 
the target state for stabilization. For this purpose, we minimize the Frobeiiius 
distance between R{v) and I = R{0). Because both matrices are orthogonal, 
it is equivalent to 


Ue(u,a;) = argmaxtr(i?(i;)) (27a) 

V 

subject to — C^R(v)H = — R{v)H). (27b) 


It can be proved that Ve{v(t),uj(t)), without noise and uncertainty, is a constant 
along any controlled trajectory. Therefore, it can be treated as a constant in 
the derivation of the BVP (5]). The solution of the maximization problem (l?7ll 
can be found by algorithms of numerical nonlinear programming. 

The HJB equation is solved at t = 0 on a sparse grid in the domain 


D = 


{ 


v,uj £ 


TT TT 

< - and 
6 6 


- -S' < wi, W2,a;3 < 



The CGL sparse grid of g = 13 is used. Similar to the approach in Example I, 
we solve the HJB equation at the grid points using 512 CPU cores in parallel. 
Then the accuracy is checked at a random of 1280 point in D. The MAE of 
the relative error is 8.5 x 10“^. A typical trajectory is shown in Figure El 
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0 5 10 15 20 25 30 35 40 45 50 

time 



time 


Fig. 7 Closed-loop trajectories {Solid (f>, u!i, ui; dashed r I 0, 0^21 “2; dotted r I 

i’, ^^ 3 ) 


In an optimal control design based on HJB equations, the most computa¬ 
tionally intensive part, i.e., solving the HJB equation, is done off-line. Once the 
equation is solved, the real-time closed-loop control can be computed using in¬ 
terpolation with a minimum computational load. In the following simulations 
of closed-loop control, the solution of the HJB equation in Example H is in¬ 
tegrated with a MFC to stabilize the system at a desired optimal attitude. 
We adopt a basic zero-order hold MFC controller. The sampling rate is 10 Hz. 
In each time interval the optimal control u* is computed using interpolation 
on the sparse grid. It is assumed that the measurement of v and uj is noise- 
corrupted. The noise has a uniform distribution with a magnitude about 0.5% 
of the maximum state value. Several trajectories under the closed-loop control 
are shown in Figures [THHl 


5.3 Example HI 

In previous examples, the closed-loop control is based on a fixed horizon, [0, T], 
The MFC algorithm uses the value of it(0, x) in feedback. In the following, we 
exemplify the method of computing u(t, x) for t in a given interval [0, ti], where 
ti < T. Because of the causality free property, it is straightforward to include 
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time 



Fig. 8 Closed-loop trajectories (Solid (p, ui; dashed r I 0, a;2, U 2 ‘i dotted r \ 


the time variable as an additional dimension in the sparse grid. Consider the 
following optimal control problem 


il = —Xi + X2, 
X2 = —X2 + 


X3 


I + Xl+ X2 


• I n 2 2 23-2^3 

X3 = ( —2xj + 2 xiX 2 — 2x2 + 


where the cost functional is 


X3 


1 + xf + X 2 / 1 + xf + X. 


+ (1 -f X^ + X2)'U^ 


2 Jto (1 + a;^ + X 2 ) 


dt. 


For any initial condition, a;(to) = an optimal control exists that achieves a 
minimum value, V{to,xo). Let z = (t,x) represents a point in the time-state 
space. The sparse grid is generated in the following region in R x 


D = {z = {t, x) € R X : t G [0, 5] and Xi G [—2, 2], for i = 1, 2,3} . 

For each grid point zj, the minimum value, V, and the associated costate. A, at 
Zj are computed by solving Different from previous examples, the sparse 
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0.4 
0.2 
s 0 
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0 5 10 15 20 25 30 35 40 45 50 

time 



time 

Fig. 9 Closed-loop trajectories ( Solid</>, u!i, ui; dashed r I 9, ut 2 , U 2 ', dotted r I 

i’, 1^3) 



grid is generated in (t, a;)-space so that one can approximate V (t, x) using an 
interpolation for t > 0 in a finite interval. As a result, the MPC feedback can 
be applied using a variable time horizon. 

The problem has a known solution 


V{t,x) = 


2{l + x\+ x\Y 


tanh(T—t), u*{t,x,) 


X3 

(1 +xl +xl) 


tanh(r—t). 


It is used to check the accuracy of the numerical result. In the computation, 
we adopted a CGL sparse grid with q = 12. The number of grid points for 
each dimension is 2'^“'* + 1 = 257. The total number of grid points in the 4-D 
domain is 

|GsVse| = 18,945, 

in comparison to dense grids, a finite difference discretization using 33 points 
in each dimension results in a grid size greater than 1.18 x 10®, which is 
impractical for personal computers. 

At all points in the sparse grid, we set the error tolerance to be 10“^. The 
upper bound of Cinterp + eevp is numerically approximated using the method 
in Section 14.31 More specifically, 1200 points are randomly generated in D. 
At each sample point, the value of V(t, x) is computed using interpolation. 
The MAE is 8.5 x 10“^. Figure [in] is the histogram of the approximation 
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10-2 - 5 - 10 -^ 0 5 - 10 -^ 1 - 10 - 


Approximation error for x) 

Fig. 10 The histogram of approximation errors for V (t, x) in D 




Fig. 11 Closed-loop trajectories (Solid xi and u; dashed r I 2:2; dotted r I 2:3) 

error, which shows a distribution concentrated around the MAE with a small 
variance 2.8 x 10“®. 

In the MFC feedback, we use a variable time window. Figure [TT] is a tra¬ 
jectory under the closed-loop control. The initial time is to = 0. Then the 
control input is updated using a fixed time step. At = 1/7. At each tk < 5, 
the time window for the feedback is T — An interpolation on the sparse 
grid is used to compute \{tk,Xk). Then control input u{tk,Xk) is applied to 
the system. At t = 5, the initial time is set to zero and the process repeats. In 
the simulations, random noise of uniform distribution in [—0.05, 0.05] is added 
to the state variables. 


6 Conclusions 

The characteristics method addressed in this paper is causality free and has 
perfect parallelism. It can be easily integrated in a parallel computer with any 
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grid, such as sparse grids to mitigate the curse of dimensionality. The examples 
show that the algorithm is tractable for 6-D HJB equations. In the case of the 
attitude control of rigid body with six state variables, a solution is computed 
on a sparse grid with about 4.5 x 10® points, whereas the corresponding dense 
grid has more than 10^^ points. In another example, the method is applied to 
a sparse grid in the time-state space so that the solution can be approximated 
within a given time interval. A theorem on the error upper bound is proved. 
In addition to the parallelism, another advantage of the causality free method 
is that the accuracy of the interpolated solution can be numerically checked 
pointwise. The effectiveness of the numerical solution for closed-loop control 
is demonstrated in Example II using a MFC controller. 

For future research, an interesting question is how to generalize the idea to 
problems with an infinite time horizon. Another interesting research topic is 
to solve minimum-time problems. In the presence of multiple solutions, iden¬ 
tifying the globally or locally optimal solution remains a challenge that needs 
to be addressed. 
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